Current situation
• Complex games/benchmarks are becoming available on Linux;
• Drivers are getting more complex as performance improves;
• Users now rely on Open Source drivers for performance.

Risks when merging new code
• Break previous functionalities / rendering;
• Break the performance of a game inadvertently;
• Improve the performance of one game but slow down others.

Review does not catch everything - Real-life example

    @@ -340,6 +340,10 @@ is_color_fast_clear_compatible(struct brw_context *brw,
                                    const union gl_color_union *color)
     {
        if (_mesa_is_format_integer_color(format))
    +      if (brw->gen >= 8) {
    +         perf_debug("Integer fast clear not enabled for (%s)",
    +                    _mesa_get_format_name(format));
    +      }
           return false;

• Up to 10% regression in some benchmarks;
• Took 13 days for the fix to reach upstream.

Introduction - Need for benchmarking

It is impossible to predict performance
Some factors affecting the performance:
• Data and code alignment, and cache hierarchy/size;
• CPU and GPU schedulers;
• Sampler configuration;
• Hardware generation;
• Power budgets.

=> Need to benchmark all the platforms and games of interest.

• Automating benchmarking

Different needs for benchmarking
• Developers: Run multiple experiments and compare them;
• QA:
  • Test patch series before they hit mainline;
  • Follow performance trends on mainline;
  • Create performance retrospectives.

Pitfalls of benchmarking
• Intra- and inter-run variance differs from one benchmark to another;
• Hitting the power budget, a thermal limit or a GPU reset;
• Being able to reproduce the different test results;
• Not using the expected libraries;
• Comparing runs generated using different environments:
  • Kernel, libdrm and Mesa versions and configurations;
  • Display server used (and its configuration);
  • Hardware and BIOS versions.

Pitfalls

The variance forces us to execute multiple runs, which takes time!

Intra-run variance due to
• Power management (boost-like features, thermal throttling);
• Concurrent tasks generating I/O, CPU or GPU load;
• Interrupts from the hardware;
• CPU/GPU schedulers.

Inter-run variance due to
• Variations in the memory allocation.
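
As an illustration of why variance drives the number of runs, here is a minimal sketch that estimates the relative margin of error of a set of results and how many runs would be needed to reach a target accuracy. The function names, the 1% target and the use of a normal approximation (z = 1.96 for roughly 95% confidence) are assumptions made for this example, not something taken from the slides; with very few runs a Student's t quantile would be more appropriate.

```python
import math
import statistics


def margin_of_error(samples, z=1.96):
    """Relative half-width of an approximate 95% confidence interval of the mean.

    Assumes the samples are roughly normally distributed and their mean is non-zero.
    """
    if len(samples) < 2:
        return math.inf  # a single run says nothing about the variance
    mean = statistics.mean(samples)
    if mean == 0:
        return math.inf
    sem = statistics.stdev(samples) / math.sqrt(len(samples))
    return z * sem / abs(mean)


def runs_needed(samples, target=0.01, z=1.96):
    """Estimate the total number of runs needed to reach the target relative margin."""
    if len(samples) < 2:
        return 2
    mean = statistics.mean(samples)
    if mean == 0:
        raise ValueError("the mean of the samples must be non-zero")
    stdev = statistics.stdev(samples)
    estimate = math.ceil((z * stdev / (target * abs(mean))) ** 2)
    return max(len(samples), estimate)


if __name__ == "__main__":
    fps = [60.0, 61.0, 59.0, 60.5, 59.5]  # made-up results of five runs
    print(f"margin: {margin_of_error(fps):.2%}, runs needed: {runs_needed(fps)}")
```

A noisy benchmark quickly inflates the estimate, which is why the execution time of a benchmarking session is hard to predict without measuring the variance first.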

CPU-limited cases
• Force the CPU to one single frequency;
• Pin the game/benchmark to a single core;
• Disable ASLR and transparent huge pages;
• Run as few services as possible;
• Pin IRQs to another core;
• Properly cool the device.

GPU-limited cases
• Force the GPU to one frequency;
• Reduce the number of active GPU contexts;
• Properly cool the device.
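
Some of the CPU-side settings listed above can be inspected from user space on Linux. The sketch below only reads the current state (ASLR, transparent huge pages, the cpufreq governor of cpu0, the CPU affinity) and offers a helper to pin the current process to one core. The function names are made up for this example, and the sysfs/procfs files may be absent depending on the kernel configuration, in which case `None` is reported.

```python
import os
from pathlib import Path


def read_setting(path):
    """Return the stripped content of a sysfs/procfs file, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def active_choice(value):
    """Extract the bracketed choice from strings such as 'always madvise [never]'."""
    if value is None:
        return None
    for word in value.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return value


def check_cpu_environment():
    """Collect the current settings without changing any of them."""
    affinity = None
    if hasattr(os, "sched_getaffinity"):  # only available on some platforms
        affinity = sorted(os.sched_getaffinity(0))
    return {
        # 0 means ASLR is disabled
        "aslr": read_setting("/proc/sys/kernel/randomize_va_space"),
        "thp": active_choice(
            read_setting("/sys/kernel/mm/transparent_hugepage/enabled")),
        "governor": read_setting(
            "/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor"),
        "affinity": affinity,
    }


def pin_to_core(core):
    """Pin the current process to a single core; children inherit the affinity."""
    os.sched_setaffinity(0, {core})


if __name__ == "__main__":
    for name, value in check_cpu_environment().items():
        print(f"{name}: {value}")
```

Storing this dictionary next to the benchmark results makes it possible to notice later that two runs were not made under the same conditions.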

Problem
If we change the environment, we skew the results! 


Be smart! 
Only get rid of what you are not trying to optimise or track! 


Many variables to check, track and remember! 
We need to help the developers and QA by automating all we can! 

Objectives of automated benchmarking
• Avoid or detect human errors;
• Make sure the data is valid;
• Be predictable in the execution time;
• Provide as much information as possible;
• Guarantee reproducibility of the results.

Concrete goals
• Be aware of every library used by the program;
• Know their versions, git IDs and compilation flags;
• Poll the resources' usage metrics;
• Store all this information inside the report;
• Understand performance results and act upon them.

Automating benchmarking
• Compute the statistical accuracy and add runs if needed;
• Get information out of the kernel about major HW events;
• Learn to give up and re-prioritise other benchmarks;
• Try to reproduce runs and detect major differences;
• Reboot the machine if unsure about the results;
• Collect usage metrics of the resources;
• Log all this information in the report.

Bisect performance changes automatically
• It adds credibility to the report;
• It also reproduces the issue.
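
The idea of bisecting a performance change can be sketched as a binary search over an ordered list of commits. This is an illustration only, not EzBench's implementation: `benchmark` is a hypothetical callable returning a score for a commit (in practice the mean of several runs), the 5% threshold is arbitrary, and the search assumes a single step change between the first and the last commit.

```python
def bisect_perf_change(commits, benchmark, threshold=0.05):
    """Return the first commit whose score deviates from the first commit's score.

    commits:   list ordered from oldest (reference) to newest
    benchmark: hypothetical callable, commit -> score (non-zero for the reference)
    threshold: relative difference considered to be a real change
    Returns None when the newest commit does not differ from the reference.
    """
    baseline = benchmark(commits[0])

    def changed(commit):
        return abs(benchmark(commit) - baseline) / abs(baseline) > threshold

    if not changed(commits[-1]):
        return None

    # Invariant: commits[lo] behaves like the reference, commits[hi] does not.
    lo, hi = 0, len(commits) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if changed(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


if __name__ == "__main__":
    # Made-up scores: the performance drops at commit "d".
    scores = {"a": 60.0, "b": 60.2, "c": 59.9, "d": 54.0, "e": 54.1, "f": 53.8}
    print(bisect_perf_change(list("abcdef"), scores.get))
```

Each probe re-runs the benchmark on an intermediate commit, so a successful bisection also confirms that the change is reproducible.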

Guaranteeing reproducibility - Why?
• Allow developers to reproduce a performance regression.

Challenges
• How do we detect the entire environment of the benchmark?

Listing dependencies
• Using ldd is insufficient because of run-time dependencies;
• strace is the most robust approach but it is slow;
• Linked libraries can be listed in /proc/<pid>/maps.
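
As a sketch of the /proc/<pid>/maps approach, the code below parses the maps file of a running process and lists the shared objects that are currently mapped. The function names are made up for this example. Note that this is a snapshot: a library that was loaded and unloaded before the file is read will not appear.

```python
import os


def parse_maps(text):
    """Return the sorted set of file paths backing the mappings of a maps dump.

    Each line looks like: address perms offset dev inode [pathname]
    Anonymous mappings and pseudo-entries such as [stack] are skipped.
    """
    paths = set()
    for line in text.splitlines():
        fields = line.split(None, 5)
        if len(fields) == 6 and fields[5].startswith("/"):
            paths.add(fields[5])
    return sorted(paths)


def mapped_libraries(pid="self"):
    """List the shared objects currently mapped by a process (Linux only)."""
    with open(f"/proc/{pid}/maps") as maps:
        return [path for path in parse_maps(maps.read())
                if ".so" in os.path.basename(path)]


if __name__ == "__main__":
    for library in mapped_libraries():
        print(library)
```

Reading the maps of the benchmark while it is running catches libraries loaded with dlopen(), which ldd cannot see.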

Query the version of a library/program
• No silver bullet;
• Can sometimes be read out of a program (Linux);
• Will often require controlling the build process.

Phoronix Test Suite - Pros
• Automates data acquisition;
• Collects some useful metrics.

Phoronix Test Suite - Cons
• Oriented towards simple reporting, not suited to performance analysts;
• Reads out the environment but with no guarantees;
• Hides performance data;
• Not git-centric.

Ben Widawsky's tool - Pros
• Strong modelling effort to validate the reported values;
• Detects some hardware events and invalidates data;
• Great for developer experiments, not for QA.

Ben Widawsky's tool - Cons
• Non-build- and non-git-aware;
• Not aware of the environment;
• Supports a limited number of benchmarks;
• Requires a lot of manual work to test big series.

Automated benchmarking - What tools?


Ideal system 
Manages the build system, the commit history and the 
environment of benchmarks while allowing metrics collection. 

Should provide a visual report that eases performance analysis. 

• Overview
• Architecture and features
• Demo

EzBench - Goals
• Provide workflows and automation to take care of most issues;
• Provide a framework quickly adaptable to your needs;
• Work for both QA and developers!

Authors: Martin Peres (Intel) & Chris Wilson (Intel);
Licence: MIT;
URL: http://cgit.freedesktop.org/~mperes/ezbench/

Architecture and features
• Modular architecture (profiles, tests and user hooks);
• Automates the acquisition of benchmark data;
• Generates a report that is usable by developers;
• Bisects performance changes automatically;
• Provides Python bindings to acquire data and parse them.

• Be crash-resistant by storing the expected goal and comparing it to the current state;
• Use a modelling approach to detect performance changes;
• Detect the environment.
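
The crash-resistance idea (store the expected goal, compare it to the current state) can be sketched as follows. This is not EzBench's actual on-disk format or API: the nested dictionaries, the JSON file and the function names are assumptions made for this example.

```python
import json
import os
import tempfile


def remaining_work(goal, state):
    """Compare the goal to the current state and return what is left to run.

    Both are dictionaries of the form {commit: {test: number_of_runs}}.
    """
    todo = {}
    for commit, tests in goal.items():
        for test, wanted in tests.items():
            missing = wanted - state.get(commit, {}).get(test, 0)
            if missing > 0:
                todo.setdefault(commit, {})[test] = missing
    return todo


def save_state(path, state):
    """Write the state atomically so a crash never leaves a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as tmp_file:
        json.dump(state, tmp_file)
    os.replace(tmp_path, path)


def load_state(path):
    """Load the state written by save_state(), or start from scratch."""
    try:
        with open(path) as state_file:
            return json.load(state_file)
    except FileNotFoundError:
        return {}


if __name__ == "__main__":
    goal = {"abc123": {"glxgears": 3, "xonotic": 3}, "def456": {"glxgears": 3}}
    state = {"abc123": {"glxgears": 3, "xonotic": 1}}
    print(remaining_work(goal, state))
```

After a crash or a reboot, the runner only needs to reload the state and recompute the remaining work to resume where it stopped.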

EzBench - Features
• Detect HW events and react to them;
• Predict run times more accurately;
• Support deadlines and test prioritisation;
• Support sending emails to the authors of perf changes;
• Integrate with patchwork to test patch series.

Demo

EzBench - Demo time!

Demo time and questions!


